We report on the status of the dynamical overlap QCD simulation project by the JLQCD collaboration.
After completing a two-flavor QCD simulation on a $16^3 \times 32$ lattice at lattice spacing $a \sim$
0.12 fm, we have started a series of runs with 2+1 flavors. In this report, we describe an outline of our
algorithms, parameter choices, and some early physics results of this second phase of our project.
1. Dynamical overlap fermion

The JLQCD collaboration is carrying out a large-scale lattice QCD simulation using the overlap
fermion formulation for sea quarks. (An overview of the project has been given at this conference
by Matsufuru [1].) The first phase of the project was a two-flavor QCD simulation on a
$16^3 \times 32$ lattice at a lattice spacing $a \sim$ 0.11-0.12 fm. The HMC simulations have been completed,
accumulating about 10,000 molecular dynamics trajectories for six values of sea quark mass ranging
from $m_s/6$ to $m_s$. Preliminary reports of this project were already presented at Lattice 2006 [2, 3, 4, 5];
at this conference we have presented physics results for pion masses and decay constants [6], pion
form factor [7], kaon $B$ parameter [8], and topological susceptibility [9]. We have also performed
simulations in the $\epsilon$-regime by reducing the sea quark mass down to 3 MeV. This lattice has been
used for the analysis of low-lying eigenvalues of the overlap-Dirac operator [11, 12, 13] and for a
calculation of meson correlators in the $\epsilon$-regime [14]. The second phase of the project is to include
the strange quark as a dynamical degree of freedom: a 2+1-flavor QCD simulation with the overlap
fermion. We aim at producing dynamical lattices of size $16^3 \times 48$ at around the same lattice
spacing, $a \sim$ 0.11-0.12 fm.

We use Neuberger's overlap-Dirac operator [15, 16]

$$D(m) = \left(m_0 + \frac{m}{2}\right) + \left(m_0 - \frac{m}{2}\right)\gamma_5\,\mathrm{sgn}\left[H_W(-m_0)\right]. \tag{1.1}$$

For the kernel operator we choose the standard Wilson fermion with a large negative mass, $m_0 = 1.6$.
For the gauge sector we use the Iwasaki gauge action together with extra Wilson fermions and
ghosts producing a factor

$$\frac{\det\left[H_W(-m_0)^2\right]}{\det\left[H_W(-m_0)^2 + \mu^2\right]} \tag{1.2}$$

in the partition function, such that the near-zero modes of $H_W(-m_0)$ are naturally suppressed [17].
This term is essential for the feasibility of the dynamical overlap fermion simulation, since it
substantially reduces the cost of approximating the sign function in (1.1). Although it prevents us
from changing the topological charge during the molecular dynamics evolution, its systematic
effect can be understood as a finite size effect and can be estimated (and even corrected) once the
topological susceptibility is known [18]. The topological susceptibility is in fact calculable on the
lattice with a fixed topology, as demonstrated in [9, 18].



2. Algorithms 

For the calculation of the sign function in (1.1) we use the rational approximation

$$\mathrm{sgn}[H_W] = H_W\left(p_0 + \sum_{l=1}^{N}\frac{p_l}{H_W^2 + q_l}\right) \tag{2.1}$$

with Zolotarev's optimal coefficients $p_l$ and $q_l$. This is applied after projecting out a few
low-lying modes of $H_W$. Typically, an accuracy of order $10^{-(7-8)}$ is achieved with $N = 10$. The multiple
inversions for $(H_W^2 + q_l)^{-1}$ can be done at once using the multi-shift conjugate gradient (CG).
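The structure of (2.1) can be illustrated with a small numerical sketch. It is not the production setup: the Zolotarev coefficients are replaced by those of Neuberger's simpler polar approximation (with $p_0 = 0$), the kernel is a hypothetical random symmetric matrix with spectrum in $0.1 \le |\lambda| \le 1$, and a dense solve stands in for each shift of the multi-shift CG. All function and variable names are illustrative.

```python
import numpy as np


def polar_coefficients(n):
    """Coefficients (p0, p, q) such that
    sgn(x) ~ x * (p0 + sum_l p[l] / (x**2 + q[l])).
    These are the polar-approximation values, used here as a stand-in
    for the Zolotarev coefficients mentioned in the text."""
    theta = np.pi * (np.arange(1, n + 1) - 0.5) / (2 * n)
    p = 1.0 / (n * np.cos(theta) ** 2)
    q = np.tan(theta) ** 2
    return 0.0, p, q


def sign_times_vector(h, b, p0, p, q):
    """Apply the rational approximation of sgn(h) to the vector b."""
    h2 = h @ h
    identity = np.eye(h.shape[0])
    acc = p0 * b
    for pl, ql in zip(p, q):
        # Dense solve standing in for one shift of a multi-shift CG.
        acc = acc + pl * np.linalg.solve(h2 + ql * identity, b)
    return h @ acc


rng = np.random.default_rng(0)
dim = 12
# Hypothetical symmetric "kernel" with spectrum in [-1, -0.1] U [0.1, 1].
eigvals = rng.uniform(0.1, 1.0, dim) * rng.choice([-1.0, 1.0], dim)
basis, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
h = basis @ np.diag(eigvals) @ basis.T
b = rng.standard_normal(dim)

p0, p, q = polar_coefficients(20)
approx = sign_times_vector(h, b, p0, p, q)
exact = basis @ (np.sign(eigvals) * (basis.T @ b))
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print(f"relative error with N = 20: {rel_err:.2e}")
```

All shifted systems share the same right-hand side, which is what allows a multi-shift solver to handle them in one Krylov iteration. The polar form needs a noticeably larger $N$ than the Zolotarev form for the same accuracy, which is why the latter is preferred in practice.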
The inversion of $D(m)$ is the most time-consuming part of the HMC simulation. In the two-flavor
runs, we mainly used the nested CG with relaxed residual for the inner CG [19]. In the
2+1-flavor runs, we use the five-dimensional solver explained in the following.

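By the Schur decomposition, the overlap solver can be written in the form (for $N = 2$, for
example)

$$
\begin{pmatrix}
H_W & -\sqrt{q_2} & & & 0 \\
-\sqrt{q_2} & -H_W & & & \sqrt{p_2} \\
 & & H_W & -\sqrt{q_1} & 0 \\
 & & -\sqrt{q_1} & -H_W & \sqrt{p_1} \\
0 & \sqrt{p_2} & 0 & \sqrt{p_1} & R\gamma_5 + p_0 H_W
\end{pmatrix}
\begin{pmatrix} \phi_{2+} \\ \phi_{2-} \\ \phi_{1+} \\ \phi_{1-} \\ \psi_4 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ \chi_4 \end{pmatrix},
\tag{2.2}
$$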



where $R = (1+m)/(1-m)$. By solving this equation we obtain a solution of $D(m)\psi_4 = \chi_4$
with $D(m)$ approximated by the rational function. The matrix in (2.2) can be viewed as a
five-dimensional (5D) matrix. An advantage of solving (2.2) is that one can use even-odd
preconditioning. Namely, rather than solving the 5D matrix $M$, we may solve the reduced system
$(1 - M_{ee}^{-1} M_{eo} M_{oo}^{-1} M_{oe})\psi_e = \chi'_e$, where the even/odd blocks of $M$ are denoted by $M_{eo}$, $M_{ee}$, etc. The
inversion $M_{ee}^{-1}$ (or $M_{oo}^{-1}$) can easily be calculated by forward (or backward) substitution
in the 5D direction.
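The equivalence between the 5D system and the 4D rational operator can be checked numerically on a toy problem. The sketch below is written under stated assumptions: $H_W$ is a hypothetical random symmetric matrix, $\gamma_5$ is a diagonal matrix of $\pm 1$, and the $N = 2$ coefficients are polar-approximation placeholders rather than Zolotarev values. It builds the block matrix of the $N = 2$ form, solves it with source $(0, 0, 0, 0, \chi_4)$, and verifies that the last slice $\psi_4$ satisfies $[R\gamma_5 + p_0 H_W + \sum_l p_l H_W (H_W^2 + q_l)^{-1}]\psi_4 = \chi_4$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8                                     # toy 4D dimension (hypothetical)
gamma5 = np.diag([1.0] * (n // 2) + [-1.0] * (n // 2))
a = rng.standard_normal((n, n))
h_w = (a + a.T) / 2                       # random symmetric stand-in for H_W
h_w /= np.linalg.norm(h_w, 2)             # scale the spectrum into [-1, 1]

# N = 2 coefficients: polar-approximation values used as placeholders
# for the Zolotarev coefficients of the text.
theta = np.pi * (np.arange(1, 3) - 0.5) / 4
p0 = 0.0
p = 1.0 / (2 * np.cos(theta) ** 2)        # p[0] = p_1, p[1] = p_2
q = np.tan(theta) ** 2                    # q[0] = q_1, q[1] = q_2
m = 0.1
r = (1 + m) / (1 - m)

eye, zero = np.eye(n), np.zeros((n, n))
sq, sp = np.sqrt(q), np.sqrt(p)
corner = r * gamma5 + p0 * h_w
m5 = np.block([
    [h_w,          -sq[1] * eye, zero,         zero,         zero],
    [-sq[1] * eye, -h_w,         zero,         zero,         sp[1] * eye],
    [zero,         zero,         h_w,          -sq[0] * eye, zero],
    [zero,         zero,         -sq[0] * eye, -h_w,         sp[0] * eye],
    [zero,         sp[1] * eye,  zero,         sp[0] * eye,  corner],
])

chi4 = rng.standard_normal(n)
rhs = np.concatenate([np.zeros(4 * n), chi4])
psi4 = np.linalg.solve(m5, rhs)[-n:]      # last 5D slice of the solution

# 4D operator obtained by eliminating the auxiliary fields (Schur complement).
h2 = h_w @ h_w
rational = p0 * h_w + sum(
    pl * h_w @ np.linalg.inv(h2 + ql * eye) for pl, ql in zip(p, q)
)
residual = np.linalg.norm((r * gamma5 + rational) @ psi4 - chi4)
print(f"4D residual of the 5D solution: {residual:.2e}")
```

The check relies on each $2 \times 2$ block squaring to $(H_W^2 + q_l)$ times the identity, so eliminating $\phi_{l\pm}$ adds $p_l H_W/(H_W^2 + q_l)$ to the corner entry.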

The low-mode projection can be implemented together with the 5D solver. The lower-right
corner is replaced by

$$\cdots + \left(m_0 + \frac{m}{2}\right)\sum_{j=1}^{N_{ev}} \mathrm{sgn}(\lambda_j)\, v_j \otimes v_j^\dagger, \tag{2.3}$$

where $P_H$ is a projector onto the subspace orthogonal to the $N_{ev}$ low-lying modes:
$P_H = 1 - \sum_{j=1}^{N_{ev}} v_j \otimes v_j^\dagger$. Then the inversion of $M_{ee}$ becomes non-trivial, but it can be calculated
cheaply because the rank of the matrix is only $2(N_{ev} + 1)$; the subspace is spanned by $x_e$, $\gamma_5 x_e$, $v_{je}$,
$\gamma_5 v_{je}$ ($j = 1, \ldots, N_{ev}$).
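The benefit of treating the low modes exactly can be seen in a small 4D sketch. It does not reproduce the 5D implementation described above. It only illustrates the decomposition $\mathrm{sgn}(H_W) \approx \sum_j \mathrm{sgn}(\lambda_j)\, v_j \otimes v_j^\dagger + P_H\, r(H_W)\, P_H$, where $r$ is a rational approximation. The spectrum, the matrix size, and the use of polar coefficients in place of Zolotarev ones are all assumptions made for illustration.

```python
import numpy as np


def rational_sign(h, n):
    """Polar rational approximation of sgn(h); placeholder for Zolotarev."""
    theta = np.pi * (np.arange(1, n + 1) - 0.5) / (2 * n)
    p = 1.0 / (n * np.cos(theta) ** 2)
    q = np.tan(theta) ** 2
    eye = np.eye(h.shape[0])
    h2 = h @ h
    return h @ sum(pl * np.linalg.inv(h2 + ql * eye) for pl, ql in zip(p, q))


rng = np.random.default_rng(2)
dim, n_ev = 16, 2
# Hypothetical spectrum: two near-zero modes plus a bulk with 0.1 <= |lambda| <= 1.
bulk = rng.uniform(0.1, 1.0, dim - n_ev) * rng.choice([-1.0, 1.0], dim - n_ev)
lam = np.concatenate([[1e-3, -5e-3], bulk])
basis, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
h = basis @ np.diag(lam) @ basis.T

evals, evecs = np.linalg.eigh(h)
low = np.argsort(np.abs(evals))[:n_ev]    # indices of the N_ev lowest modes
v = evecs[:, low]                         # low-mode eigenvectors v_j
proj_h = np.eye(dim) - v @ v.T            # projector orthogonal to the low modes

exact = evecs @ np.diag(np.sign(evals)) @ evecs.T
plain = rational_sign(h, 20)
projected = v @ np.diag(np.sign(evals[low])) @ v.T + proj_h @ plain @ proj_h

err_plain = np.linalg.norm(plain - exact, 2)
err_projected = np.linalg.norm(projected - exact, 2)
print(f"without projection: {err_plain:.2e}, with projection: {err_projected:.2e}")
```

With the near-zero modes removed, the approximation only has to be accurate on the bulk of the spectrum, so a modest degree suffices.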

We compare the performance of the 5D solver with the relaxed CG in 4D. The elapsed time
to solve the 5D equation is plotted in Figure 1 as a function of quark mass $m$. The lattice size is
$16^3 \times 48$ and the measurement is done on a half-rack (512 nodes) of the BlueGene/L supercomputer
(2.7 TFlops peak performance). Data for $N = 10$ are connected by lines for both 4D and 5D solvers.
Evidently, the 5D solver is faster by about a factor of 3-4. Increasing the degree $N$ of the
rational approximation requires more computational cost for both 4D and 5D. For the 5D case, the
cost is naively expected to be proportional to $N$, but the actual measurement shows a slower increase,
which indicates some overhead due to the construction of the low-mode projector, etc.



3. Odd number of flavors 



Figure 1: Comparison of solver performance. Data for $N = 10$ are connected by lines: 4D (red squares) and
5D (black circles).

Introduction of pseudo-fermions for the dynamical quark flavors is the starting point of HMC.
For the two-flavor case this is straightforward: one writes $\det D^2$ as $\int [d\phi][d\phi^\dagger] \exp[-|H^{-1}\phi|^2]$,
where $H = \gamma_5 D$. The same trick applied to one flavor introduces $D^{-1/2}$ in the pseudo-fermion
action, which requires a method to calculate the inverse square root of the Dirac operator. (For such
algorithms, see [23], for example.) For the overlap-Dirac operator this problem can be avoided
as follows [24, 25]. Thanks to the exact chiral symmetry of the overlap fermion, $H^2 = (\gamma_5 D)^2$
commutes with $\gamma_5$, and therefore can be decomposed into positive and negative chirality subspaces:

$$H^2 = P_+ H^2 P_+ + P_- H^2 P_- = Q_+ + Q_-, \tag{3.1}$$

where $P_\pm = (1 \pm \gamma_5)/2$. Then its determinant factorizes, $\det H^2 = \det Q_+ \cdot \det Q_-$. Since $Q_+$
and $Q_-$ share their eigenvalues except for those of the zero modes, $\det H^2 = (\det Q_+)^2 = (\det Q_-)^2$ up
to the zero-mode contribution, which is a trivial factor for topology-fixed simulations. In order
to simulate one flavor, one can just pick one chiral sector of $H^2$.
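The properties used in this argument can be verified on a toy operator. The following is a minimal sketch under stated assumptions: the sign function of the kernel is replaced by a random symmetric matrix that squares to one and has equal numbers of positive and negative eigenvalues (so that no zero modes appear), and $\gamma_5$ is diagonal, so that $P_\pm$ simply select the upper-left and lower-right blocks. The mass values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6                                     # size of each chirality block (toy value)
gamma5 = np.diag([1.0] * n + [-1.0] * n)

# Hypothetical sgn(H_W): symmetric, squares to one, n positive and n negative
# eigenvalues (a topologically trivial toy configuration).
u, _ = np.linalg.qr(rng.standard_normal((2 * n, 2 * n)))
sign_k = u @ np.diag([1.0] * n + [-1.0] * n) @ u.T

m0, m = 1.6, 0.05
# Overlap operator of the form (1.1) and its Hermitian version H = gamma5 D.
d = (m0 + m / 2) * np.eye(2 * n) + (m0 - m / 2) * gamma5 @ sign_k
h = gamma5 @ d
h2 = h @ h

commutator = np.linalg.norm(h2 @ gamma5 - gamma5 @ h2)

# Chiral blocks: Q_+ = P_+ H^2 P_+ and Q_- = P_- H^2 P_- restricted to their sectors.
q_plus = h2[:n, :n]
q_minus = h2[n:, n:]
ev_plus = np.linalg.eigvalsh(q_plus)
ev_minus = np.linalg.eigvalsh(q_minus)

det_h2 = np.linalg.det(h2)
det_ratio = det_h2 / (np.linalg.det(q_plus) * np.linalg.det(q_minus))
max_split = np.max(np.abs(ev_plus - ev_minus))
print("||[H^2, gamma5]||           =", commutator)
print("det H^2 / (det Q+ det Q-)   =", det_ratio)
print("max |eig(Q+) - eig(Q-)|     =", max_split)
```

In this toy setting the two sectors carry identical spectra, so $\det Q_+$ alone reproduces the square root of $\det H^2$, which is the one-flavor weight.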

Thus, we introduce a pseudo-fermion field for the one-flavor piece as $S_{PF1} = \sum_x \phi_\sigma^\dagger(x)\, Q_\sigma^{-1}\, \phi_\sigma(x)$,
where $\sigma$ can be either $+$ or $-$, representing the chiral sector. At the beginning of each HMC
trajectory, we refresh $\phi_\sigma(x)$ from a Gaussian distribution $\xi(x)$ as $\phi_\sigma(x) = Q_\sigma^{1/2}\xi(x)$. This step requires
a calculation of the square root of $Q_\sigma$, which is done using the rational approximation. Calculation
of the molecular-dynamics force is straightforward: one can simply project onto the chiral sector
$\sigma$ in the calculation of the force from $H^2$.
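The refresh step can be illustrated with a few lines of NumPy. This is a sketch under simplifying assumptions: $Q_\sigma$ is a hypothetical small positive-definite real matrix, the fields are real vectors, and the square root is taken by eigendecomposition rather than by the rational approximation used in the actual runs. It checks that $\phi = Q_\sigma^{1/2}\xi$ gives $\phi^\dagger Q_\sigma^{-1} \phi = \xi^\dagger \xi$, so the pseudo-fermion action starts each trajectory with the Gaussian weight.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
# Hypothetical positive-definite matrix standing in for Q_sigma.
a = rng.standard_normal((n, n))
q_sigma = a @ a.T + 0.1 * np.eye(n)

# Q^{1/2} via eigendecomposition (the text uses a rational approximation instead).
w, u = np.linalg.eigh(q_sigma)
sqrt_q = u @ np.diag(np.sqrt(w)) @ u.T

xi = rng.standard_normal(n)               # Gaussian noise xi(x)
phi = sqrt_q @ xi                         # refreshed pseudo-fermion field
action = phi @ np.linalg.solve(q_sigma, phi)
noise_norm = xi @ xi
print(f"S_PF1 = {action:.6f}, |xi|^2 = {noise_norm:.6f}")
```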



4. Runs 

The 2+1-flavor runs are done at $\beta = 2.30$, which is the same value as in our main two-flavor runs.
The unit trajectory length $\tau$ is set to 1.0, twice as long as in the two-flavor runs. Our choice of the
sea quark mass parameters is summarized in Table 1. The up and down quark mass $m_{ud}$ ranges
from $m_s$ down to $\sim m_s/6$, as in our two-flavor runs. For the strange quark mass we take two values,
aiming at interpolating to the physical strange quark mass.

At the time of the lattice conference, the runs had proceeded to 500-1,000 HMC trajectories,
depending on the mass parameter. One trajectory takes about 1-2 hours on one rack (1,024 nodes) of
BlueGene/L (5.7 TFlops peak performance). The acceptance rate is kept around 80-90% for each
run.
Table 1: Sea quark mass parameters

Figure 2: Molecular dynamics time evolution of the number of CG iterations in the calculation of the HMC
Hamiltonian. Data at $m_{ud} = 0.025$ (left) and 0.050 (right) with $m_s = 0.100$. In the plot, "ov1" denotes up
and down quarks, while "ov2" corresponds to strange. "PF2" stands for the inversion with the original sea
quark mass, and "PF1" is for the preconditioner, whose mass is chosen to be 0.4 for $m_q \ge 0.035$ or 0.2 for
$m_q \le 0.025$.



Figure 2 shows the number of (two) 5D CG iterations in the calculation of the HMC Hamiltonian.
As expected, the calculation for the two-flavor piece dominates the cost.

Measurements of physical quantities are done every 5 trajectories, so far only for the $m_s =
0.100$ lattices. For use in low-mode preconditioning and low-mode averaging, we are
calculating 80 pairs of low-lying eigenmodes of the overlap-Dirac operator. The lattice spacing as
determined through the Sommer scale $r_0$ (= 0.49 fm) is plotted in Figure 3 for both 2- and 2+1-flavor
lattices. At the same $\beta$ value (= 2.30) the lattice spacing decreases as more dynamical flavors
are included.

Preliminary results for the pion and kaon masses squared and decay constants are shown in Figure 4.
Data at $m_s = 0.100$ are plotted as a function of sea quark mass. Although the statistics are still low
(< 1,000 trajectories for each sea quark mass), reasonably precise data are obtained using the
low-mode averaging technique. A detailed analysis with the chiral extrapolation is yet to be done after
accumulating more statistics.



Figure 3: Lattice spacing as a function of sea quark mass. At $\beta = 2.30$, two-flavor data (black circles) are
plotted together with a line of chiral extrapolation. 2+1-flavor data are plotted for both $m_s = 0.100$ (blue
squares) and 0.080 (blue triangles). A quenched result at the same $\beta$ value is shown by a red band.

Figure 4: Preliminary results for pion and kaon mass squared (left) and their decay constants (right) as a
function of sea quark mass.

Numerical simulations are performed on Hitachi SR11000 and IBM System Blue Gene Solution
at High Energy Accelerator Research Organization (KEK) under a support of its Large Scale
Simulation Program (No. 07-16). This work is supported in part by the Grant-in-Aid of the
Ministry of Education, Culture, Sports, Science and Technology (No. 17740171, 18034011, 18340075,
18740167, 18840045, 19540286 and 19740160).